Indexing a 'Custom DataSet' (VB.NET).
https://keyoti.com/kb/Default.aspx?ToDo=view&questId=247&catId=54


When data needs to be indexed from a source that is not directly supported (the directly supported sources are websites, file systems and known databases), or when the data needs programmatic treatment first, the indexing can be performed using the CustomDataSetProvider importer.

 

1.       Create a class library project that will return the data to the indexer.  It needs only one class, in this case “DataProvider” (see the attached example project).

 

Imports System
Imports System.Data

Public Class DataProvider

    'the number of rows in our dataset (we're going to generate them on the fly, since it's not real data)
    Private numberOfRowsToCreate As Integer = 2000

    'called repeatedly by the indexer: firstRow is the index of the first row wanted (starting at 0),
    'numberOfRows is the maximum number of rows to return
    Public Function GetDataSet(ByVal firstRow As Integer, ByVal numberOfRows As Integer) As DataSet

        'if the indexer is asking for rows that don't exist in our source (or, in this
        'example, that we won't create) then return Nothing to tell it to stop.
        If firstRow >= numberOfRowsToCreate Then Return Nothing

        'create a DataSet to return, with one table
        Dim ds As New DataSet
        Dim table As New DataTable
        Dim data(2) As Object
        Dim i As Integer = 0
        Dim r As New Random
        ds.Tables.Add(table)

        '3 basic columns
        table.Columns.Add("id", GetType(Integer))
        table.Columns.Add("test")
        table.Columns.Add("random")

        'generate the data requested, fill in the rows with made up data
        While firstRow + i < numberOfRowsToCreate AndAlso i < numberOfRows
            data(0) = i + firstRow
            data(1) = "This is row number " & (i + firstRow)
            data(2) = r.Next(999999)

            table.Rows.Add(data)
            i = i + 1
        End While

        'return the dataset to the indexer
        Return ds
    End Function

End Class

 

2.       Build the project to create a DLL, in this case “CustomDataSetImport.dll”.

3.       Import the data using the following import parameters:

 

“Assembly path” = the path to the DLL, either absolute or relative to the index directory, e.g. ..\bin\Debug\CustomDataSetImport.dll (if the index directory sits directly under the project directory, as in the attached example).

“Full class name” = the full class name including the namespace, e.g. CustomDataSetImport.DataProvider

“Unique field” = the name of the column in the generated dataset that holds a unique identifier; in our example it’s called “id”.

 

 

Alternatively, to import programmatically, use the following code:

 

Dim importer As DocumentIndex = New DocumentIndex(Configuration)

'path to the DLL (relative to the index directory) and the full class name of the provider
Dim loc As String = "..\bin\Debug\CustomDataSetImport.dll"
Dim query As String = "CustomDataSetImport.DataProvider"
'"id" is the unique field, followed by the URL format string
importer.Import(IndexableSourceRecordFactory.CreateRecord(SourceType.CustomDataSetProvider, loc, query, "id", "http://localhost/recordViewer.aspx?key={0}&keyField={1}"))
importer.Close()

 

 

About the GetDataSet method:

 

When you import from a CustomDataSetProvider, the indexer loads your provider class (above) and then repeatedly calls the GetDataSet method until it gets back either Nothing or no new data.
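The calling pattern described above can be imitated with a short loop. This is only a sketch of the pattern, not the indexer's actual code: TinyProvider, CallPatternDemo and CountRows are hypothetical names invented for the illustration, and advancing firstRow by the page size is an assumption based on the example parameters later in this article.

```vbnet
Imports System
Imports System.Data

' Hypothetical stand-in for a provider class; it serves 12 generated rows.
Public Class TinyProvider
    Private Const TotalRows As Integer = 12

    Public Function GetDataSet(ByVal firstRow As Integer, ByVal numberOfRows As Integer) As DataSet
        ' Past the end of the source: return Nothing so the caller stops.
        If firstRow >= TotalRows Then Return Nothing

        Dim ds As New DataSet()
        Dim table As DataTable = ds.Tables.Add("rows")
        table.Columns.Add("id", GetType(Integer))

        ' Return at most numberOfRows rows, starting at firstRow.
        Dim lastRow As Integer = Math.Min(firstRow + numberOfRows, TotalRows)
        For rowIndex As Integer = firstRow To lastRow - 1
            table.Rows.Add(rowIndex)
        Next
        Return ds
    End Function
End Class

Public Module CallPatternDemo
    ' Imitates the calling pattern only (assumed behaviour, not the real indexer).
    Public Function CountRows(ByVal provider As TinyProvider, ByVal pageSize As Integer) As Integer
        Dim firstRow As Integer = 0
        Dim total As Integer = 0
        Do
            Dim page As DataSet = provider.GetDataSet(firstRow, pageSize)
            ' Stop when the provider returns Nothing or a page without rows.
            If page Is Nothing OrElse page.Tables.Count = 0 OrElse page.Tables(0).Rows.Count = 0 Then Exit Do
            total += page.Tables(0).Rows.Count
            firstRow += pageSize
        Loop
        Return total
    End Function

    Public Sub Main()
        ' Pages of 5, 5 and 2 rows are fetched, then Nothing ends the loop.
        Console.WriteLine(CountRows(New TinyProvider(), 5))
    End Sub
End Module
```

With a page size of 5, the loop calls GetDataSet four times: three calls return rows and the fourth (firstRow=15) returns Nothing.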

 

You are expected to check the firstRow and numberOfRows arguments in that method and return only the rows they specify, much like serving pages of data.
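If the rows come from a SQL Server database, one possible way to honour firstRow and numberOfRows is the paging overload of the data adapter's Fill method. The sketch below is an illustration under assumptions, not part of the attached project: the class name SqlDataProvider, the connection string, and the Articles table with its columns are all hypothetical. It targets the .NET Framework, where System.Data.SqlClient is built in.

```vbnet
Imports System.Data
Imports System.Data.SqlClient

' Hypothetical provider that pages through a SQL Server table.
Public Class SqlDataProvider
    ' Assumed connection string and query; replace them with your own.
    Private Const ConnectionString As String = "Server=(local);Database=ExampleDb;Integrated Security=True"
    ' A fixed ORDER BY keeps the row order stable between calls.
    Private Const Query As String = "SELECT id, title, body FROM Articles ORDER BY id"

    Public Function GetDataSet(ByVal firstRow As Integer, ByVal numberOfRows As Integer) As DataSet
        Dim ds As New DataSet()
        Using connection As New SqlConnection(ConnectionString)
            Using adapter As New SqlDataAdapter(Query, connection)
                ' Fill skips firstRow records, then loads at most numberOfRows of them.
                Dim rowsRead As Integer = adapter.Fill(ds, firstRow, numberOfRows, "Articles")
                ' No rows left in the source: return Nothing to stop the indexer.
                If rowsRead = 0 Then Return Nothing
            End Using
        End Using
        Return ds
    End Function
End Class
```

Fill still reads through the skipped records on each call, so for very large tables a query that pages on the server side may be preferable.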

 

For example (on a much smaller scale), if your database contains these rows

 

field1    field2

60007     xxxxx

60008     aaaaa

60009     wwwww

60101     qqqqq

60102     fffff

60103     ppppp

60220     kkkkk

60221     lllll

 

and the indexer requests only a small page size (it actually requests 500 rows at a time, but assume 5 for this example), then the parameters may be

 

firstRow=0 and numberOfRows=5

 

then you should return a DataSet with

 

60007     xxxxx

60008     aaaaa

60009     wwwww

60101     qqqqq

60102     fffff

 

and then you'll find the method called again, this time with these parameters

 

firstRow=5 and numberOfRows=5

 

and this time you return

 

60103     ppppp

60220     kkkkk

60221     lllll

 

The only other situation you need to consider is when the indexer asks for rows that don't exist, e.g.

 

firstRow=10 and numberOfRows=5

 

then you should return Nothing, which tells it to stop.
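The walkthrough above can be sketched as a provider that serves the eight sample rows from memory. SampleProvider and its two arrays are hypothetical names used only for this illustration; a real provider would read from its own data source instead.

```vbnet
Imports System
Imports System.Data

' Hypothetical provider holding the eight sample rows from the example.
Public Class SampleProvider
    Private Shared ReadOnly Keys As Integer() = {60007, 60008, 60009, 60101, 60102, 60103, 60220, 60221}
    Private Shared ReadOnly Values As String() = {"xxxxx", "aaaaa", "wwwww", "qqqqq", "fffff", "ppppp", "kkkkk", "lllll"}

    Public Function GetDataSet(ByVal firstRow As Integer, ByVal numberOfRows As Integer) As DataSet
        ' Rows requested beyond the end of the source: return Nothing to stop.
        If firstRow >= Keys.Length Then Return Nothing

        Dim ds As New DataSet()
        Dim table As DataTable = ds.Tables.Add("sample")
        table.Columns.Add("field1", GetType(Integer))
        table.Columns.Add("field2", GetType(String))

        ' Copy only the requested page; the last page may be shorter.
        Dim lastRow As Integer = Math.Min(firstRow + numberOfRows, Keys.Length)
        For rowIndex As Integer = firstRow To lastRow - 1
            table.Rows.Add(Keys(rowIndex), Values(rowIndex))
        Next
        Return ds
    End Function
End Class
```

GetDataSet(0, 5) returns the five rows from 60007 to 60102, GetDataSet(5, 5) returns the remaining three rows, and GetDataSet(10, 5) returns Nothing.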

 

 

